ABSTRACT

A finite difference version of 
the equations governing two-
dimensional, non-divergent flow
on a sphere is implemented and 
integrated on the MPP. The MPP's 
performance is then compared with 
the CYBER's. 

Keywords: Numerical weather 
prediction, computational fluid 
dynamics. 


INTRODUCTION 


The purpose of the work described 
here was to demonstrate the feas- 
ibility of using a massively par- 
allel architecture to solve the 
hydrodynamic equations as they 
are used in numerical weather 
prediction (NWP).

Models used in NWP are commonly
divided into two parts: the
"dynamics" and the "physics". The
dynamics performs the time
integration of the equations of
motion. The physics computes the
heating, friction, and sources
and sinks of water vapor. These
two parts present very different 
problems to a highly parallel 
machine. 

Many of the calculations in the 
dynamics involve the parallel 
updating of the many degrees of 
freedom allowed in the discreti- 
zation and are thus very suitable 
to a machine like the MPP. 
Occasionally, however, it is 
necessary to obtain a spectral 
transform or solve an elliptic 
equation. These problems, al- 
though parallel, are non-local 
and thus difficult to implement 
efficiently on the MPP's nearest 
neighbor network. Fortunately, 
the non-local calculations can be 
minimized by a suitable choice of 
numerical scheme. For example, 
grid-point models, in which the 
equations are finite differenced 
on a latitude-longitude lattice,
are much preferable to spectral
models, which require frequent
transformations between physical
and spectral space. Still,
non-local calculations are not
completely avoidable. In particular
they appear in the solution of 
elliptic equations that occur 
when implicit time differencing 
schemes are used. Although these 
too could be avoided by using an 
explicit method (which is in fact
done in many models, even on
serial computers), we feel the
architecture should not be so
specialized as to completely
forbid such choices.

Problems in the physics part of 
the codes are probably even more 
serious. In these, it is their 
non-parallel, rather than non- 
local, nature that makes for 
difficulties. As an example,
consider condensation. In most
models this is done level by
level, testing for
super-saturation and passing the
excess water to the next level
below. That level in turn may
become super-saturated, or may
have been so already. The
condensation calculation is then
repeated, and so on until
"rainfall" reaches the surface. If
parallelism is exploited by
mapping each latitude-longitude
point onto a different processor
(this is really the only practical
alternative in a machine with as
many processors as the MPP), each
one will in general encounter
different condensation conditions.
Processors at all grid points
where there is no condensation,
for example, will be idle in this
segment of the code, and
parallelism will be lost.
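
The level-by-level procedure
described above can be sketched as
follows. This is a minimal
illustration, not code from any
NWP model: the function name, the
array layout (levels by grid
points, top level first) and the
rule that all excess water simply
falls to the next level are
assumptions made for the example.
The "active" fraction it returns
is the share of grid points that
actually condense at each level;
on a one-point-per-processor
machine the remaining processors
would be idle.

```python
import numpy as np


def condense_columns(q, q_sat):
    """Level-by-level condensation for many columns at once.

    q, q_sat: arrays of shape (levels, points), level 0 at the top.
    Returns (adjusted q, rainfall reaching the surface per point,
    fraction of points condensing at each level).
    """
    q = q.copy()
    levels, points = q.shape
    carry = np.zeros(points)        # water falling in from the level above
    active = np.zeros(levels)
    for k in range(levels):         # the sweep over levels is sequential
        q[k] += carry
        excess = np.maximum(q[k] - q_sat[k], 0.0)
        mask = excess > 0.0         # points that condense at this level
        active[k] = mask.mean()     # all other points have nothing to do
        q[k] -= excess              # bring the level back to saturation
        carry = excess              # pass the excess to the level below
    return q, carry, active
```

The loop over levels cannot be
removed, and inside it only the
masked points do useful work,
which is the loss of parallelism
the text refers to.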

THIS STUDY 

To start looking at the problems 
one faces with a parallel archi- 
tecture, we decided to use the 
barotropic vorticity equation as 
a model of the "dynamics" part of 
NWP models. In this way we can 
test both the parallel grid-point
updating segments and the more
challenging problem of solving an 
elliptic equation. 

At each step of the calculation 
we update the following equation 
for a new value of the vorticity: 
where ζ is the vorticity, u
and v are the zonal and
meridional velocity components of
the non-divergent flow, θ and λ
are the latitude and longitude,
and f is the Coriolis parameter.
As mentioned already, (1) is
solved by finite-differencing on
a latitude-longitude grid. A leap-frog
differencing scheme is used in 
time. Once a new value of the 
vorticity is obtained from the 
discrete version of (1), the 
Poisson equation: 


$$\nabla^2 \psi = \zeta \qquad (2)$$


is solved for the stream-function.
To solve (2) we use a "fast"
method in which the equations are
first Fourier transformed in the
zonal direction, then finite
differenced in the meridional
direction and solved as a set of
tri-diagonal systems. The velocity
components u and v are then
obtained from the stream-function.
Having u and v, (1) can be updated
again and the cycle completed.
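
For reference, the three relations
in this cycle are commonly written
as below. This is a sketch under
stated assumptions, not necessarily
the exact form used in the model:
it takes the usual advective form
of the barotropic vorticity
equation without the forcing and
drag terms, and it introduces two
symbols that the text does not
name, $\psi$ for the
stream-function and $R$ for the
radius of the sphere.

```latex
\begin{align}
\frac{\partial \zeta}{\partial t}
  &= -\frac{u}{R\cos\theta}\,\frac{\partial \zeta}{\partial \lambda}
     -\frac{v}{R}\,\frac{\partial (\zeta + f)}{\partial \theta}
     \tag{1} \\
\nabla^{2}\psi &= \zeta \tag{2} \\
u &= -\frac{1}{R}\,\frac{\partial \psi}{\partial \theta},
\qquad
v = \frac{1}{R\cos\theta}\,\frac{\partial \psi}{\partial \lambda}
     \notag
\end{align}
```

With these definitions the flow is
non-divergent by construction, and
taking the curl of (u, v) recovers
$\zeta = \nabla^2 \psi$, so the
three relations close the cycle.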

To test the model, (1) was forced
with sources of angular momentum 
and eddy vorticity, and damped by 
a linear drag. 

Tests were conducted in parallel 
on the MPP and the CYBER 205 at 
Goddard Space Flight Center. The 
CYBER calculations were done with
half-precision (32-bit)
arithmetic. Both MPP and CYBER
codes were optimized for their
machines to the best of our
abilities, but both used exactly
the same algorithm. In particular,
the "fast" solver used for (2),
which is very efficient on the
CYBER, was retained on the MPP. On
the other hand, a 128x128 square
grid was used in both cases. This
is optimal for the MPP. Higher
resolution would require either
doing a prohibitive amount of I/O,
or keeping more than one
grid-point per processor, which is
not possible with the MPP's
limited memory. The CYBER
efficiency, in contrast, is
independent of resolution for all
practical choices.
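
The "fast" solver for (2) that was
retained on both machines (Fourier
transform in the zonal direction,
tri-diagonal systems in the
meridional direction) can be
sketched in NumPy as follows. This
is an illustration of the
technique, not the MPP or CYBER
code. It assumes a unit sphere,
latitudes staggered so that no grid
point lies on a pole, and a
zonal-mean stream-function fixed by
setting its first value to zero;
the function name is hypothetical.

```python
import numpy as np


def solve_poisson_sphere(zeta):
    """Solve del^2 psi = zeta on a unit sphere (illustrative sketch).

    zeta has shape (nlat, nlon). Row j sits at latitude
    -pi/2 + (j + 1/2) * pi / nlat, so the poles fall on cell edges.
    """
    nlat, nlon = zeta.shape
    h = np.pi / nlat
    c = np.cos(-np.pi / 2 + (np.arange(nlat) + 0.5) * h)   # cell centres
    c_edge = np.cos(-np.pi / 2 + np.arange(nlat + 1) * h)  # cell edges
    c_edge[0] = c_edge[-1] = 0.0      # exactly zero at the two poles

    # Step 1: Fourier transform in the zonal direction.
    rhs = np.fft.rfft(zeta, axis=1)                # shape (nlat, nm)
    nm = rhs.shape[1]
    m = np.arange(nm)

    # Step 2: second-order differencing in latitude gives, for every
    # zonal wavenumber m, one tri-diagonal system.
    lower = c_edge[:-1] / (c * h**2)               # couples j to j-1
    upper = np.repeat((c_edge[1:] / (c * h**2))[:, None], nm, axis=1)
    diag = -(lower[:, None] + upper) - (m[None, :] / c[:, None]) ** 2

    # m = 0 is defined only up to a constant: pin psi to zero in row 0.
    diag[0, 0] = 1.0
    upper[0, 0] = 0.0
    rhs[0, 0] = 0.0

    # Step 3: Thomas algorithm, all wavenumbers advanced together.
    for j in range(1, nlat):
        w = lower[j] / diag[j - 1]
        diag[j] -= w * upper[j - 1]
        rhs[j] -= w * rhs[j - 1]
    psi_hat = np.empty_like(rhs)
    psi_hat[-1] = rhs[-1] / diag[-1]
    for j in range(nlat - 2, -1, -1):
        psi_hat[j] = (rhs[j] - upper[j] * psi_hat[j + 1]) / diag[j]

    # Step 4: transform back to physical space.
    return np.fft.irfft(psi_hat, n=nlon, axis=1)
```

Applied to ζ = -2 cos θ cos λ, for
which the exact answer is
ψ = cos θ cos λ, the sketch should
reproduce ψ to within the
truncation error of the meridional
differencing. Both the transform
and the sweeps along each meridian
couple distant grid points, which
is why this step maps poorly onto a
nearest-neighbor network.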


RESULTS 


The timing results are shown in
Table I. We have separated these
into two parts: the time spent
solving the Poisson equation (2),
and all the rest, which is mostly
computing the right-hand side of
(1) and a little housekeeping.
Units are msec/timestep. At the
resolution used, we were taking
200 time steps per day. As may be
seen, the code is approximately
four times slower on the MPP than
on the CYBER. This poor
performance, however, is due
entirely to the Poisson solver,
which runs some ten times slower
on the MPP. The updating of the
vorticity equation is twice as
fast on the MPP. This is a very
encouraging result.
TABLE I


If the NWP model is grid-point
and uses explicit time
differencing, the elliptic solver
is not needed, and the MPP (or an
MPP-like machine) should do very
well in the dynamics. However,
even if the model is implicit, and
one or several elliptic equations
have to be solved, the situation
is not as bad as Table I would
indicate. In a typical situation
we would be solving some 40
equations like (1) (4 variables
[u,v,T,q] at 10 levels), but at
most 10 equations like (2). Using
these figures, we can extrapolate
our results to a full, grid-point,
semi-implicit NWP model. This is
shown in Table II. As may be seen,
the situation is much improved;
the MPP is now at near CYBER
performance, even doing all ten
vertical modes implicitly.
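
The kind of extrapolation described
here can be illustrated with a
short calculation. The sketch below
uses only the rounded ratios quoted
in the text (update about twice as
fast on the MPP, Poisson solver
some ten times slower, whole code
about four times slower), not the
table entries, so its output is
indicative only. The function name
and its arguments are hypothetical.

```python
def extrapolated_slowdown(update_ratio, poisson_ratio, overall_ratio,
                          n_update, n_poisson):
    """Estimate the MPP/CYBER time ratio for a larger model.

    All ratios are MPP time divided by CYBER time. Times are measured
    in units of the CYBER's cost for one update of an equation like (1).
    """
    # CYBER cost p of one Poisson solve, inferred from the one-equation
    # case: (update_ratio + poisson_ratio * p) / (1 + p) = overall_ratio
    p = (overall_ratio - update_ratio) / (poisson_ratio - overall_ratio)
    mpp = n_update * update_ratio + n_poisson * poisson_ratio * p
    cyber = n_update + n_poisson * p
    return mpp / cyber


# 40 equations like (1) and 10 like (2), with the rounded ratios above.
ratio = extrapolated_slowdown(0.5, 10.0, 4.0, n_update=40, n_poisson=10)
```

With these rounded inputs the ratio
comes out somewhat under two,
against about four for the single
vorticity equation, and it falls to
one half when no elliptic equations
are solved. The result is sensitive
to the rounding of the quoted
ratios.
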

Obviously, much work remains to
be done before massively parallel 
machines can be used efficiently 
for numerical weather prediction. 
In particular, it is imperative 
that much more parallel formula- 
tions and/or algorithms be deve- 
loped for the physics codes, a 
problem we have not even begun to 
address here. Nevertheless, we 
feel that the results presented 
indicate a very real possibility 
of using MPP-like machines in
NWP. 